This video introduces string data structure and its interesting properties.
Codes:
This track of the course covers the topic "Strings".
In details, this track will cover:
Objective: The objective of this track is to familiarize the learners with Strings.
Track Content:
Assessment: All Tracks in every week are associated with weekly contests.
This video introduces string data structure and its interesting properties.
Codes:
Links to the working codes discussed in the Video:
This video introduces String in Java and functions on strings.
Codes:
This video talks about two methods for palindrome checking.
Given two strings, we need to check if these strings are Anagram of each other or not.
Codes:
Given a string, the task is to find the first character (whose leftmost appearance is first) that repeats.
Codes:
Given a string, the task is to find the leftmost character that does not repeat.
Codes:
This video talks about an interesting solution to reverse words in a string.
Codes:
This video talks about overview of pattern searching algorithms (Naive, Improved Naive for Distinct, Rabin Karp and KMP).
Given a pattern and a text, we need to print all occurrences of the pattern in the text. This video talks about O((m+n-1)*m) solution.
Codes:
Given a pattern with distinct characters and a text, we need to print all occurrences of the pattern in the text. This video talks about improved Naive pattern searching with Theta(n) time complexity
Codes:
Rabin Karp Algorithm
Codes:
Two methods of LPS (Longest Proper Prefx Suffix) Array are discussed. One method has time complexity O(n^3) and other method is O(n).
Codes:
Complete KMP algorithm is discussed. This algorithm uses the LPS array.
Codes:
An interesting solution to solve the problem.
Codes:
Given a text and a pattern, the task is to find if there is anagram of pattern present in text. The video talks about two solutions to solve the problem.
Codes:
Given a string, we need to find lexicographic rank of a string
Codes:
In this video three approaches to solve the problem are discussed. O(n^3), O(n^2) and O(n)
Codes:
char str_name[size];In the above syntax, str_name is any name given to the string variable and size is used to define the length of the string, i.e the number of characters strings will store. Please keep in mind that there is an extra terminating character which is the Null character ('\0') used to indicate the termination of string which differs strings from normal character arrays.
1. char str[] = "GeeksforGeeks";
2. char str[50] = "GeeksforGeeks";
3. char str[] = {'G','e','e','k','s','f','o','r','G','e','e','k','s','\0'};
4. char str[14] = {'G','e','e','k','s','f','o','r','G','e','e','k','s','\0'};
Geeks
String is : GeeksforGeeks
std::string Class in C++
C++ has in its definition a way to represent the sequence of characters as an object of a class. This class is called std::string. The String class stores the characters as a sequence of bytes with functionality of allowing access to single byte character.string string_name = "Sample String";
GeeksTo learn about std::string in details, refer: std::string class in C++.
Creating a String
There are two ways to create string in Java:String s = “GeeksforGeeks”;
String s = new String (“GeeksforGeeks”);
"GeeksforGeeks".length(); // returns 13
"GeeksforGeeks".charAt(3); // returns ‘k’
"GeeksforGeeks".substring(3); // returns “ksforGeeks”
"GeeksforGeeks".substring(2, 5); // returns “eks”
String s1 = ”Geeks”;
String s2 = ”forGeeks”;
String output = s1.concat(s2); // returns “GeeksforGeeks”
String s = ”Learn Share Learn”;
int output = s.indexOf(“Share”); // returns 6
String s = ”Learn Share Learn”;
int output = s.indexOf(‘a’,3);// returns 8
String s = ”Learn Share Learn”;
int output = s.lastIndexOf(‘a’); // returns 14
Boolean out = “Geeks”.equals(“Geeks”); // returns true
Boolean out = “Geeks”.equals(“geeks”); // returns false
Boolean out= “Geeks”.equalsIgnoreCase(“Geeks”); // returns true
Boolean out = “Geeks”.equalsIgnoreCase(“geeks”); // returns true
int out = s1.compareTo(s2); // where s1 ans s2 are
// strings to be compared
This returns difference s1-s2. If :
out < 0 // s1 comes before s2
out = 0 // s1 and s2 are equal.
out > 0 // s1 comes after s2.
int out = s1.compareToIgnoreCase(s2);Note- In this case, it will not consider case of a letter (it will ignore whether it is uppercase or lowercase).
// where s1 ans s2 are
// strings to be compared
This returns difference s1-s2. If :
out < 0 // s1 comes before s2
out = 0 // s1 and s2 are equal.
out > 0 // s1 comes after s2.
String word1 = “HeLLo”;
String word3 = word1.toLowerCase(); // returns “hello"
String word1 = “HeLLo”;
String word2 = word1.toUpperCase(); // returns “HELLO”
String word1 = “ Learn Share Learn “;
String word2 = word1.trim(); // returns “Learn Share Learn”
String s1 = “feeksforfeeks“;Note:- s1 is still feeksforfeeks and s2 is geeksgorgeeks
String s2 = “feeksforfeeks”.replace(‘f’ ,’g’); // returns “geeksgorgeeks”
// Java code to illustrate different constructors and methods
// String class.
import java.io.*;
import java.util.*;
class Test
{
public static void main (String[] args)
{
String s= "GeeksforGeeks";
// or String s= new String ("GeeksforGeeks");
// Returns the number of characters in the String.
System.out.println("String length = " + s.length());
// Returns the character at ith index.
System.out.println("Character at 3rd position = "
+ s.charAt(3));
// Return the substring from the ith index character
// to end of string
System.out.println("Substring " + s.substring(3));
// Returns the substring from i to j-1 index.
System.out.println("Substring = " + s.substring(2,5));
// Concatenates string2 to the end of string1.
String s1 = "Geeks";
String s2 = "forGeeks";
System.out.println("Concatenated string = " +
s1.concat(s2));
// Returns the index within the string
// of the first occurrence of the specified string.
String s4 = "Learn Share Learn";
System.out.println("Index of Share " +
s4.indexOf("Share"));
// Returns the index within the string of the
// first occurrence of the specified string,
// starting at the specified index.
System.out.println("Index of a = " +
s4.indexOf('a',3));
// Checking equality of Strings
Boolean out = "Geeks".equals("geeks");
System.out.println("Checking Equality " + out);
out = "Geeks".equals("Geeks");
System.out.println("Checking Equality " + out);
out = "Geeks".equalsIgnoreCase("gEeks ");
System.out.println("Checking Equality " + out);
int out1 = s1.compareTo(s2);
System.out.println("If s1 = s2 " + out);
// Converting cases
String word1 = "GeeKyMe";
System.out.println("Changing to lower Case " +
word1.toLowerCase());
// Converting cases
String word2 = "GeekyME";
System.out.println("Changing to UPPER Case " +
word1.toUpperCase());
// Trimming the word
String word4 = " Learn Share Learn ";
System.out.println("Trim the word " + word4.trim());
// Replacing characters
String str1 = "feeksforfeeks";
System.out.println("Original String " + str1);
String str2 = "feeksforfeeks".replace('f' ,'g') ;
System.out.println("Replaced f with g -> " + str2);
}
}String length = 13
Character at 3rd position = k
Substring ksforGeeks
Substring = eks
Concatenated string = GeeksforGeeks
Index of Share 6
Index of a = 8
Checking Equality false
Checking Equality true
Checking Equality false
If s1 = s2 false
Changing to lower Case geekyme
Changing to UPPER Case GEEKYME
Trim the word Learn Share Learn
Original String feeksforfeeks
Replaced f with g -> geeksgorgeeks
Input: txt[] = "THIS IS A TEST TEXT"Pattern searching is an important problem in computer science. When we do search for a string in notepad/word file or browser or database, pattern searching algorithms are used to show the search results.
pat[] = "TEST"
Output: Pattern found at index 10
Input: txt[] = "AABAACAADAABAABA"
pat[] = "AABA"
Output: Pattern found at index 0
Pattern found at index 9
Pattern found at index 12
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13
Input: txt[] = "THIS IS A TEST TEXT"
pat[] = "TEST"
Output: Pattern found at index 10
Input: txt[] = "AABAACAADAABAABA"
pat[] = "AABA"
Output: Pattern found at index 0
Pattern found at index 9
Pattern found at index 12
hash( txt[s+1 .. s+m] ) = ( d ( hash( txt[s .. s+m-1]) - txt[s]*h ) + txt[s + m] ) mod q
Where,
hash( txt[s .. s+m-1] ) : Hash value at shift s.
hash( txt[s+1 .. s+m] ) : Hash value at next shift (or shift s+1)
d: Number of characters in the alphabet
q: A prime number
h: d^(m-1)
/* Following program is a C/C++ implementation
of Rabin Karp Algorithm */
#include<stdio.h>
#include<string.h>
// d is the number of characters
// in the input alphabet
#define d 256
/* pat -> pattern
txt -> text
q -> A prime number
*/
void search(char pat[], char txt[], int q)
{
int M = strlen(pat);
int N = strlen(txt);
int i, j;
int p = 0; // hash value for pattern
int t = 0; // hash value for txt
int h = 1;
// The value of h would be "pow(d, M-1)%q"
for (i = 0; i < M-1; i++)
h = (h*d)%q;
// Calculate the hash value of pattern and first
// window of text
for (i = 0; i < M; i++)
{
p = (d*p + pat[i])%q;
t = (d*t + txt[i])%q;
}
// Slide the pattern over text one by one
for (i = 0; i <= N - M; i++)
{
// Check the hash values of current window of text
// and pattern. If the hash values match then only
// check for characters on by one
if ( p == t )
{
/* Check for characters one by one */
for (j = 0; j < M; j++)
{
if (txt[i+j] != pat[j])
break;
}
// if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]
if (j == M)
printf("Pattern found at index %d \n", i);
}
// Calculate hash value for next window of text: Remove
// leading digit, add trailing digit
if ( i < N-M )
{
t = (d*(t - txt[i]*h) + txt[i+M])%q;
// We might get negative value of t, converting it
// to positive
if (t < 0)
t = (t + q);
}
}
}
/* Driver program to test above function */
int main()
{
char txt[] = "GEEKS FOR GEEKS";
char pat[] = "GEEK";
int q = 101; // A prime number
search(pat, txt, q);
return 0;
}
// Following program is a Java implementation
// of Rabin Karp Algorithm
public class Main
{
// d is the number of characters in
// the input alphabet
public final static int d = 256;
/* pat -> pattern
txt -> text
q -> A prime number
*/
static void search(String pat, String txt, int q)
{
int M = pat.length();
int N = txt.length();
int i, j;
int p = 0; // hash value for pattern
int t = 0; // hash value for txt
int h = 1;
// The value of h would be "pow(d, M-1)%q"
for (i = 0; i < M-1; i++)
h = (h*d)%q;
// Calculate the hash value of pattern and first
// window of text
for (i = 0; i < M; i++)
{
p = (d*p + pat.charAt(i))%q;
t = (d*t + txt.charAt(i))%q;
}
// Slide the pattern over text one by one
for (i = 0; i <= N - M; i++)
{
// Check the hash values of current window of text
// and pattern. If the hash values match then only
// check for characters on by one
if ( p == t )
{
/* Check for characters one by one */
for (j = 0; j < M; j++)
{
if (txt.charAt(i+j) != pat.charAt(j))
break;
}
// if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]
if (j == M)
System.out.println("Pattern found at index " + i);
}
// Calculate hash value for next window of text: Remove
// leading digit, add trailing digit
if ( i < N-M )
{
t = (d*(t - txt.charAt(i)*h) + txt.charAt(i+M))%q;
// We might get negative value of t, converting it
// to positive
if (t < 0)
t = (t + q);
}
}
}
/* Driver program to test above function */
public static void main(String[] args)
{
String txt = "GEEKS FOR GEEKS";
String pat = "GEEK";
int q = 101; // A prime number
search(pat, txt, q);
}
}Pattern found at index 0
Pattern found at index 10
Input: txt[] = "THIS IS A TEST TEXT"
pat[] = "TEST"
Output: Pattern found at index 10
Input: txt[] = "AABAACAADAABAABA"
pat[] = "AABA"
Output: Pattern found at index 0
Pattern found at index 9
Pattern found at index 12
KMP (Knuth Morris Pratt) Pattern Searching
The Naive pattern searching algorithm doesn’t work well in cases where we see many matching characters followed by a mismatching character. Following are some examples.txt[] = "AAAAAAAAAAAAAAAAAB"The KMP matching algorithm uses degenerating property (pattern having the same sub-patterns appearing more than once in the pattern) of the pattern and improves the worst case complexity to O(n). The basic idea behind KMP’s algorithm is: whenever we detect a mismatch (after some matches), we already know some of the characters in the text of the next window. We take advantage of this information to avoid matching the characters that we know will anyway match. Let us consider the below example to understand this.
pat[] = "AAAAB"
txt[] = "ABABABCABABABCABABABC"
pat[] = "ABABAC" (not a worst case, but a bad case for Naive)
Matching Overview
txt = "AAAAABAAABA"
pat = "AAAA"
We compare first window of txt with pat
txt = "AAAAABAAABA"
pat = "AAAA" [Initial position]
We find a match. This is same as Naive String Matching.
In the next step, we compare next window of txt with pat.
txt = "AAAAABAAABA"
pat = "AAAA" [Pattern shifted one position]
This is where KMP does optimization over Naive. In this
second window, we only compare fourth A of pattern
with fourth character of current window of text to decide
whether current window matches or not. Since we know
first three characters will anyway match, we skipped
matching first three characters.
Need of Preprocessing?
An important question arises from the above explanation,
how to know how many characters to be skipped. To know this,
we pre-process pattern and prepare an integer array
lps[] that tells us the count of characters to be skipped.
lps[i] = the longest proper prefix of pat[0..i]
which is also a suffix of pat[0..i].
Examples of lps[] construction:
For the pattern “AAAA”,
lps[] is [0, 1, 2, 3]
For the pattern “ABCDE”,
lps[] is [0, 0, 0, 0, 0]
For the pattern “AABAACAABAA”,
lps[] is [0, 1, 0, 1, 2, 0, 1, 2, 3, 4, 5]
For the pattern “AAACAAAAAC”,
lps[] is [0, 1, 2, 0, 1, 2, 3, 3, 3, 4]
For the pattern “AAABAAA”,
lps[] is [0, 1, 2, 0, 1, 2, 3]
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
lps[] = {0, 1, 2, 3}
i = 0, j = 0
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] match, do i++, j++
i = 1, j = 1
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] match, do i++, j++
i = 2, j = 2
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
pat[i] and pat[j] match, do i++, j++
i = 3, j = 3
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] match, do i++, j++
i = 4, j = 4
Since j == M, print pattern found and reset j,
j = lps[j-1] = lps[3] = 3
Here unlike Naive algorithm, we do not match first three
characters of this window. Value of lps[j-1] (in above
step) gave us index of next character to match.
i = 4, j = 3
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] match, do i++, j++
i = 5, j = 4
Since j == M, print pattern found and reset j,
j = lps[j-1] = lps[3] = 3
Again unlike Naive algorithm, we do not match first three
characters of this window. Value of lps[j-1] (in above
step) gave us index of next character to match.
i = 5, j = 3
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] do NOT match and j > 0, change only j
j = lps[j-1] = lps[2] = 2
i = 5, j = 2
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] do NOT match and j > 0, change only j
j = lps[j-1] = lps[1] = 1
i = 5, j = 1
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] do NOT match and j > 0, change only j
j = lps[j-1] = lps[0] = 0
i = 5, j = 0
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] do NOT match and j is 0, we do i++.
i = 6, j = 0
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] match, do i++ and j++
i = 7, j = 1
txt[] = "AAAAABAAABA"
pat[] = "AAAA"
txt[i] and pat[j] match, do i++ and j++
We continue this way...
// C++ program for implementation of KMP
// pattern searching algorithm
#include <bits/stdc++.h>
using namespace std;
void computeLPSArray(char* pat, int M, int* lps);
// Prints occurrences of txt[] in pat[]
void KMPSearch(char* pat, char* txt)
{
int M = strlen(pat);
int N = strlen(txt);
// create lps[] that will hold the longest prefix suffix
// values for pattern
int lps[M];
// Preprocess the pattern (calculate lps[] array)
computeLPSArray(pat, M, lps);
int i = 0; // index for txt[]
int j = 0; // index for pat[]
while (i < N) {
if (pat[j] == txt[i]) {
j++;
i++;
}
if (j == M) {
printf("Found pattern at index %d ", i - j);
j = lps[j - 1];
}
// mismatch after j matches
else if (i < N && pat[j] != txt[i]) {
// Do not match lps[0..lps[j-1]] characters,
// they will match anyway
if (j != 0)
j = lps[j - 1];
else
i = i + 1;
}
}
}
// Fills lps[] for given patttern pat[0..M-1]
void computeLPSArray(char* pat, int M, int* lps)
{
// length of the previous longest prefix suffix
int len = 0;
lps[0] = 0; // lps[0] is always 0
// the loop calculates lps[i] for i = 1 to M-1
int i = 1;
while (i < M) {
if (pat[i] == pat[len]) {
len++;
lps[i] = len;
i++;
}
else // (pat[i] != pat[len])
{
// This is tricky. Consider the example.
// AAACAAAA and i = 7. The idea is similar
// to search step.
if (len != 0) {
len = lps[len - 1];
// Also, note that we do not increment
// i here
}
else // if (len == 0)
{
lps[i] = 0;
i++;
}
}
}
}
// Driver program to test above function
int main()
{
char txt[] = "ABABDABACDABABCABAB";
char pat[] = "ABABCABAB";
KMPSearch(pat, txt);
return 0;
}
// JAVA program for implementation of KMP pattern
// searching algorithm
class KMP_String_Matching {
void KMPSearch(String pat, String txt)
{
int M = pat.length();
int N = txt.length();
// create lps[] that will hold the longest
// prefix suffix values for pattern
int lps[] = new int[M];
int j = 0; // index for pat[]
// Preprocess the pattern (calculate lps[]
// array)
computeLPSArray(pat, M, lps);
int i = 0; // index for txt[]
while (i < N) {
if (pat.charAt(j) == txt.charAt(i)) {
j++;
i++;
}
if (j == M) {
System.out.println("Found pattern "
+ "at index " + (i - j));
j = lps[j - 1];
}
// mismatch after j matches
else if (i < N && pat.charAt(j) != txt.charAt(i)) {
// Do not match lps[0..lps[j-1]] characters,
// they will match anyway
if (j != 0)
j = lps[j - 1];
else
i = i + 1;
}
}
}
void computeLPSArray(String pat, int M, int lps[])
{
// length of the previous longest prefix suffix
int len = 0;
int i = 1;
lps[0] = 0; // lps[0] is always 0
// the loop calculates lps[i] for i = 1 to M-1
while (i < M) {
if (pat.charAt(i) == pat.charAt(len)) {
len++;
lps[i] = len;
i++;
}
else // (pat[i] != pat[len])
{
// This is tricky. Consider the example.
// AAACAAAA and i = 7. The idea is similar
// to search step.
if (len != 0) {
len = lps[len - 1];
// Also, note that we do not increment
// i here
}
else // if (len == 0)
{
lps[i] = len;
i++;
}
}
}
}
// Driver program to test above function
public static void main(String args[])
{
String txt = "ABABDABACDABABCABAB";
String pat = "ABABCABAB";
new KMP_String_Matching().KMPSearch(pat, txt);
}
}Found pattern at index 10
pat[] = "AAACAAAA"
len = 0, i = 0.
lps[0] is always 0, we move
to i = 1
len = 0, i = 1.
Since pat[len] and pat[i] match, do len++,
store it in lps[i] and do i++.
len = 1, lps[1] = 1, i = 2
len = 1, i = 2.
Since pat[len] and pat[i] match, do len++,
store it in lps[i] and do i++.
len = 2, lps[2] = 2, i = 3
len = 2, i = 3.
Since pat[len] and pat[i] do not match, and len > 0,
set len = lps[len-1] = lps[1] = 1
len = 1, i = 3.
Since pat[len] and pat[i] do not match and len > 0,
len = lps[len-1] = lps[0] = 0
len = 0, i = 3.
Since pat[len] and pat[i] do not match and len = 0,
Set lps[3] = 0 and i = 4.
We know that characters pat
len = 0, i = 4.
Since pat[len] and pat[i] match, do len++,
store it in lps[i] and do i++.
len = 1, lps[4] = 1, i = 5
len = 1, i = 5.
Since pat[len] and pat[i] match, do len++,
store it in lps[i] and do i++.
len = 2, lps[5] = 2, i = 6
len = 2, i = 6.
Since pat[len] and pat[i] match, do len++,
store it in lps[i] and do i++.
len = 3, lps[6] = 3, i = 7
len = 3, i = 7.
Since pat[len] and pat[i] do not match and len > 0,
set len = lps[len-1] = lps[2] = 2
len = 2, i = 7.
Since pat[len] and pat[i] match, do len++,
store it in lps[i] and do i++.
len = 3, lps[7] = 3, i = 8
We will stop here as we have constructed the whole lps[].
Asked In: Amazon
Asked In: Amazon
Asked In: Ola Cabs
A | One |
B | Two |
C | Theta(n) Traversals where n is length of the string |
D | Theta(Log n) Traversals where n is length of the string |